{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Copy of tutorial.ipynb",
      "version": "0.3.2",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "A9v6gNRNF2QS",
        "colab_type": "text"
      },
      "source": [
        "# Example code for training a network with knowledge distillation\n",
        "This tutorial explains the code for training a network with knowledge distillation. To make it easy to train a new model yourself, the tutorial runs on Google Colab.\n",
        "\n",
        "Most knowledge distillation algorithms follow one of two training procedures.\n",
        "\n",
        "The first initializes the student network with the teacher's knowledge, as in FitNet, FSP, and AB. Its training procedure consists of training the teacher network, initializing the student network, and fine-tuning the student network.\n",
        "\n",
        "The second is multi-task learning, which learns the main task and the teacher's knowledge at the same time, as in Soft-logits, AT, and KD-SVD. Its training procedure consists of training the teacher network and then training the student network with the teacher's knowledge.\n",
        "\n",
        "To support both procedures with a single training script, I combine the initialization step and the fine-tuning step."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bYxZQrgyKROU",
        "colab_type": "text"
      },
      "source": [
        "# Cloning the GitHub code\n",
        "The first step is to clone the GitHub repository. After running the code below, you will find the code in the 'Files' tab."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "trlir3CQG1Sv",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!git clone https://github.com/sseung0703/Knowledge_distillation_methods_wtih_Tensorflow.git"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KNm4Qv6gGA3x",
        "colab_type": "text"
      },
      "source": [
        "# Training teacher network\n",
        "The next step is to train the teacher network. The teacher network is trained without any distillation method, and the main scope name is set to 'Teacher' so that the teacher parameters are easy to assign later.\n",
        "The code below is an example of training a new teacher network.\n",
        "\n",
        "On Google Colab this takes about 2 hours. If you do not have enough time, you can skip this step and use the trained parameters I uploaded."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ORMTub2KJljo",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!python Knowledge_distillation_methods_wtih_Tensorflow/train_w_distill.py --Distillation=None --train_dir=test --main_scope=Teacher"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZnnsjlNyKGep",
        "colab_type": "text"
      },
      "source": [
        "When training is done, you can find the trained parameters, saved as 'train_params.mat', in the training directory.\n",
        "Copy that file to the 'pre_trained' directory with the code below."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "pFrkaRI7Mgfb",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "! cp test/train_params.mat Knowledge_distillation_methods_wtih_Tensorflow/pre_trained/ResNet_teacher.mat"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "w0VUzEWcMyG2",
        "colab_type": "text"
      },
      "source": [
        "# Training Student Network with Teacher's Knowledge\n",
        "Finally, we can train a student network with the teacher's knowledge. To use the teacher network's parameters, we have to specify the name of the saved parameter file and the distillation method.\n",
        "For example, I set the name to ResNet_teacher, as defined above, and the distillation method to RKD, the most recent method in my experiments. In my experiments KD-SVD performs best, but it is also the slowest of the implemented methods, and the memory bottleneck on Google Colab is worse than on real hardware. So if you want to try KD-SVD, you should run it on your own hardware.  \n",
        "\n",
        "If you use the provided weights, remove the `--teacher` flag.\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "YsMHosFhN7YC",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!python Knowledge_distillation_methods_wtih_Tensorflow/train_w_distill.py\\\n",
        " --Distillation=RKD --train_dir=kdsvd --main_scope=Student_w_KD-SVD --teacher=ResNet_teacher"
      ],
      "execution_count": 0,
      "outputs": []
    }
  ]
}